1 Introduction

Several theories beyond the standard model (SM) of particle physics address the gauge
hierarchy problem and other shortcomings of the SM by introducing a spectrum of new particles
that are partners of the SM particles [1-3]. These new particles may include neutral, stable,
and weakly interacting particles that are good dark-matter candidates. The identity and
properties of the fundamental particle(s) that make up dark matter are two of the most important
unsolved problems in particle physics and cosmology. The energy density of dark matter is
approximately five times larger than that of the normal baryonic matter that corresponds to the
luminous portion of the universe. A review of dark matter can be found in Ref. [4].

Many dark-matter candidates are stable as a result of a conserved quantity. In supersymmetry 
(SUSY) this quantity is R parity, and its conservation requires all SUSY particles to be produced 
in pairs and the lightest SUSY particle (LSP) to be stable. Coloured SUSY particles can be pair- 
produced copiously at the Large Hadron Collider (LHC). These particles will decay directly 
into SM particles and an LSP or via intermediate colour-singlet states that ultimately decay 
into an LSP, resulting in a large amount of energy deposited in the detector. The LSP will pass 
through the detector without interacting, carrying away a substantial amount of energy and 
creating an imbalance in the measured transverse momentum ($\not{p}_T$).

Experiments at the Tevatron [5-8], SPS [9], LEP [10-13], and HERA colliders [14, 15] have
performed extensive searches for SUSY and set lower limits on the masses of SUSY particles.
At the LHC, the CMS Collaboration has previously published limits in the all-hadronic channel
based on a search using the $\alpha_T$ [16] kinematic variable [17]. The ATLAS Collaboration has
also published limits from a missing transverse momentum and multijet search [18].

In this paper, results are presented from a search for large missing transverse momentum in
multijet events produced in pp collisions at a centre-of-mass energy of 7 TeV, using a data
sample collected with the CMS detector at the LHC in 2010, corresponding to an integrated
luminosity of 36 pb$^{-1}$. The results of the search are presented in the context of the
constrained minimal supersymmetric extension of the standard model (CMSSM) [19], and in the more
general context of simplified models [20-22]. These latter models are designed to characterize
experimental data in terms of a small number of particles whose masses and decay branching
fractions are allowed to vary freely. The results are independent of any more complete theory
that addresses the deeper problems of particle physics, yet they can be translated into any such
desired framework.

This search is complementary to the CMS analysis [17] that used the kinematic variable
$\alpha_T$ as the search variable in events with at least two jets. That variable is very
effective in suppressing the QCD multijet background, but with some loss of signal acceptance.
In contrast, this search only selects events with $\geq 3$ jets, and the missing and visible
transverse momentum sums are used as search variables for an inclusive selection with a higher
signal acceptance.

The main backgrounds in this analysis are: (a) an irreducible background from Z+jets events,
with the Z boson decaying to $\nu\bar{\nu}$, denoted as Z($\nu\bar{\nu}$)+jets; (b) W+jets and
$t\bar{t}$ events, with either the directly produced W boson or one of the W bosons from the
top-quark decays going directly or via a $\tau$ to an e or $\mu$ that is lost, or going to a
$\tau$ that decays hadronically. In all these cases, one or more neutrinos provide a genuine
source of missing transverse momentum; and (c) QCD multijet events with large missing transverse
momentum from leptonic decays of heavy-flavour hadrons inside the jets, jet energy
mismeasurement, or instrumental noise and non-functioning detector components. The relative
contributions of these three categories of backgrounds depend on the event selection.



This paper is organized as follows. The CMS detector and event reconstruction are described in
Section 2. In Section 3 the event selection criteria are presented. The backgrounds to this
search are directly determined from the data, in some cases with novel techniques which are
being applied here for the first time. In Section 4 the irreducible Z($\nu\bar{\nu}$)+jets
background is estimated from $\gamma$+jets events, and alternative Z and W control samples are
studied. The background from W+jets and $t\bar{t}$ where a lepton is either lost or is a
hadronically decaying tau lepton is estimated from $\mu$+jets events by ignoring or replacing
the muon, as discussed in Section 5. The QCD multijet kinematics are predicted using measured
jet resolution functions to smear events obtained by a procedure that produces well-balanced
events out of inclusive multijet data, as discussed in Section 6. As a cross-check, the
correlation between the transverse missing momentum vector and the angular distance between
that vector and the closest leading jet is used to predict the tail of the missing-momentum
distribution. In Section 7 the interpretation of the observed data is presented.

2 The CMS detector and event reconstruction 

The central feature of the CMS apparatus is a superconducting solenoid 13 m in length and
6 m in diameter, which provides an axial magnetic field of 3.8 T. The bore of the solenoid
is instrumented with various particle detection systems. The steel return yoke outside the
solenoid is in turn instrumented with gas detectors which are used to identify muons. Charged
particle trajectories are measured by the silicon pixel and strip tracker, covering
$0 < \phi < 2\pi$ in azimuth and $|\eta| < 2.5$, where the pseudorapidity $\eta$ is defined as
$\eta = -\ln[\tan(\theta/2)]$, with $\theta$ being the polar angle of the particle's momentum
with respect to the counterclockwise beam direction. A lead-tungstate crystal electromagnetic
calorimeter (ECAL) and a brass/scintillator hadronic calorimeter (HCAL) surround the tracking
volume and cover the region $|\eta| < 3$. Quartz/steel forward hadron calorimeters extend the
coverage to $|\eta| < 5$. The detector is nearly hermetic, allowing for momentum balance
measurements in the plane transverse to the beam directions. A detailed description of the CMS
detector can be found elsewhere [23].
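
As a small illustration of the pseudorapidity definition used for the acceptance boundaries above, the following sketch converts between polar angle and pseudorapidity. The function names are illustrative and not taken from any CMS software.

```python
import math

def pseudorapidity(theta: float) -> float:
    """Return eta = -ln[tan(theta/2)] for a polar angle theta in radians."""
    if not 0.0 < theta < math.pi:
        raise ValueError("theta must lie strictly between 0 and pi")
    return -math.log(math.tan(theta / 2.0))

def polar_angle(eta: float) -> float:
    """Inverse relation: theta = 2 * atan(exp(-eta))."""
    return 2.0 * math.atan(math.exp(-eta))

# Polar angles (in degrees) at the tracker, calorimeter and forward
# calorimeter acceptance edges quoted in the text.
edges_deg = {eta: math.degrees(polar_angle(eta)) for eta in (2.5, 3.0, 5.0)}
```

A particle emitted perpendicular to the beam has zero pseudorapidity, and the polar angle shrinks rapidly as the pseudorapidity grows, which is why the forward calorimeters are needed for near-hermetic coverage.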

All physics objects are reconstructed with a particle-flow technique [24]. This algorithm
identifies and reconstructs individually the particles produced in the collision, namely charged
and neutral hadrons, photons, muons, and electrons, by combining the information from the
tracking system, the calorimeters, and the muon system. All these particles are clustered into
jets using the anti-$k_T$ algorithm with a distance parameter of 0.5 [25] from FASTJET [26]. Jet
energies are corrected for the non-linear calorimeter response using calibration factors derived
from simulation, and, for jets in data, an additional residual energy correction derived from
data is applied [27]. As the average number of additional pileup interactions during the LHC 2010
data taking is roughly between two and three, no subtraction of the pileup energy deposits is
performed.

3 Sample selection 

The event selection for this search aims to be inclusive, such that it can detect new physics
from any model yielding a high-multiplicity hadronic final state with missing transverse
momentum. Therefore, the observables of central interest in the search are chosen to be the
magnitude of the missing transverse momentum $\not{H}_T$ calculated from jets, and the scalar
sum of the jet transverse momenta $H_T$. The choice of these observables and the applied
background suppression cuts aim for a minimal kinematic bias in the search for new physics
signals. This facilitates the characterization of new physics in the case of a discovery.
Furthermore, the selection is chosen to be efficient for models containing new particles with
sufficiently small mass and thus sizeable production yield for the integrated luminosity used
in this search. In this section, the event selection is described, based on the above
considerations.

3.1 Trigger selection and cleaning of the data sample 

The data used in this analysis were collected with triggers based on the quantity
$H_T^{\rm trig}$, defined as the scalar sum of the transverse momenta of reconstructed
calorimeter jets (without response corrections) having $p_T > 20$ GeV and $|\eta| < 5$. The
$H_T^{\rm trig}$ threshold varied between 100 and 150 GeV as the instantaneous luminosity of the
LHC increased. The $H_T$ trigger has a high acceptance for low-mass hadronic new-physics
signatures, and it enables the simultaneous collection of several control samples used to
estimate the backgrounds. The trigger efficiency as a function of the particle-flow-based $H_T$,
defined below in Section 3.2, is found to be close to 100% for $H_T$ values above 300 GeV.

Ways to remove events with a poor $\not{H}_T$ measurement were investigated using both
simulation and data. Various sources of noise in the electromagnetic and hadronic calorimeters
are rejected [28, 29]. Beam-related background events and displaced satellite collisions are
removed by requiring a well-reconstructed primary vertex within the luminous region, applying a
beam-halo veto [29], asking for a significant fraction of tracks in the event to be of high
quality, and requiring the scalar sum of the transverse momenta of tracks associated with the
primary vertex to be greater than 10% of the scalar sum of the transverse momenta of all jets
within the tracker acceptance. Events are also rejected in which a significant amount of energy
is determined to have been lost in the approximately 1% of non-functional crystals in the ECAL
that are masked in reconstruction [28]. Such losses are identified either by exploiting the
energy measured through a parallel readout path used for the online trigger, or by measuring the
energy deposited around masked crystals when information from this parallel readout path is not
available. The small inefficiency for signal events induced by this cleaning is discussed
further in Section 7.1.



3.2 Baseline and search event selections 

The search selection starts from a loosely selected sample of candidate events. From this so- 
called baseline sample, tighter search selection criteria are then applied to obtain the final event 
sample. The baseline selection requirements are: 

• At least three jets with $p_T > 50$ GeV and $|\eta| < 2.5$.

• $H_T > 300$ GeV, with $H_T$ defined as the scalar sum of the transverse momenta of all
jets with $p_T > 50$ GeV and $|\eta| < 2.5$.

• $\not{H}_T > 150$ GeV, with $\not{H}_T$ defined as the magnitude of the negative vector sum
of the transverse momenta of all jets with $p_T > 30$ GeV and $|\eta| < 5$. This requirement
suppresses the vast majority of the QCD multijet events.

• $|\Delta\phi(J_n, \not{H}_T)| > 0.5$, $n = 1, 2$ and $|\Delta\phi(J_3, \not{H}_T)| > 0.3$,
where $\Delta\phi$ is the azimuthal angular difference between the jet axis $J_n$ and the
$\not{H}_T$ direction for the three highest-$p_T$ jets in the event. This requirement rejects
most of the QCD multijet events in which a single mismeasured jet yields a high-$\not{H}_T$
value.

• No isolated muons or electrons in the event. A loose lepton definition is employed
to reject the leptonic final states of $t\bar{t}$ and W/Z+jets events. Muons and electrons are
required to have $p_T > 10$ GeV and produce a good-quality track that is matched to
the primary vertex within 200 $\mu$m transversely and 1 cm longitudinally. They must
also be isolated, requiring a relative isolation variable to satisfy:

$\left[ \sum^{\Delta R<0.4} p_T^{\rm charged\ hadron} + \sum^{\Delta R<0.4} p_T^{\rm neutral\ hadron} + \sum^{\Delta R<0.4} p_T^{\rm photon} \right] / p_T^{\rm lepton} < 0.2$,

where $p_T^{\rm charged\ hadron}$, $p_T^{\rm neutral\ hadron}$, and $p_T^{\rm photon}$ are,
respectively, the transverse momenta of charged hadrons, neutral hadrons, and photons in the
event within a distance $\Delta R = 0.4$ in $\eta$-$\phi$ space of the lepton. Muons are
required to have $|\eta| < 2.4$, whereas electrons must have $|\eta| < 2.5$, excluding the
barrel-endcap transition region $1.44 < |\eta| < 1.57$.
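
The relative isolation requirement in the last item of the list above can be sketched as follows. This is a minimal illustration, not the CMS implementation: the dictionary layout of the lepton and particle records and all function names are assumptions made for the example.

```python
import math

# Particle types that enter the isolation sum, as listed in the text.
ISOLATION_KINDS = {"charged_hadron", "neutral_hadron", "photon"}

def delta_r(eta1, phi1, eta2, phi2):
    """Distance in eta-phi space, with the azimuthal difference wrapped to [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)

def relative_isolation(lepton, particles, cone=0.4):
    """Sum of particle pT inside the cone around the lepton, divided by the lepton pT."""
    cone_sum = sum(
        p["pt"] for p in particles
        if p["kind"] in ISOLATION_KINDS
        and delta_r(lepton["eta"], lepton["phi"], p["eta"], p["phi"]) < cone
    )
    return cone_sum / lepton["pt"]

def is_isolated(lepton, particles, threshold=0.2):
    """Apply the relative isolation requirement (< 0.2) from the baseline selection."""
    return relative_isolation(lepton, particles) < threshold
```

The azimuthal wrap-around matters for leptons near $\phi = \pm\pi$, where a naive difference would place nearby particles outside the cone.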

Two search regions are chosen, based on the observables central to this inclusive
jets-plus-missing-transverse-momentum search. The first selection, defining the
high-$\not{H}_T$ search region, tightens the baseline cuts with an $\not{H}_T > 250$ GeV
requirement, motivated by the search for a generic dark-matter candidate, which gives a large
background rejection. The second selection adds a cut of $H_T > 500$ GeV to the baseline
criteria, yielding the high-$H_T$ search region, which is sensitive to the higher multiplicities
from cascade decays of high-mass new-physics particles. Such cascades lead to more energy being
transferred to visible particles and less to invisible ones.
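
The jet-based part of the baseline selection and the two search regions can be sketched as below. This is an illustration under stated assumptions rather than the analysis code: jets are represented as hypothetical `(pt, eta, phi)` tuples in GeV and radians, the lepton veto is reduced to a count of isolated leptons supplied by the caller, and the three jets used for the azimuthal requirement are taken to be the leading jets with pT > 50 GeV and |eta| < 2.5.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi]."""
    return math.remainder(phi1 - phi2, 2.0 * math.pi)

def h_t(jets):
    """Scalar sum of jet pT for jets with pT > 50 GeV and |eta| < 2.5."""
    return sum(pt for pt, eta, _ in jets if pt > 50.0 and abs(eta) < 2.5)

def missing_h_t(jets):
    """Magnitude and direction of the negative vector pT sum (pT > 30 GeV, |eta| < 5)."""
    used = [(pt, phi) for pt, eta, phi in jets if pt > 30.0 and abs(eta) < 5.0]
    px = -sum(pt * math.cos(phi) for pt, phi in used)
    py = -sum(pt * math.sin(phi) for pt, phi in used)
    return math.hypot(px, py), math.atan2(py, px)

def passes_baseline(jets, n_isolated_leptons=0):
    """Jet multiplicity, H_T, missing H_T, delta-phi and lepton-veto requirements."""
    hard = sorted((j for j in jets if j[0] > 50.0 and abs(j[1]) < 2.5), reverse=True)
    if len(hard) < 3 or n_isolated_leptons > 0:
        return False
    mht, mht_phi = missing_h_t(jets)
    if h_t(jets) <= 300.0 or mht <= 150.0:
        return False
    minimum_dphi = (0.5, 0.5, 0.3)  # thresholds for the three leading jets
    return all(abs(delta_phi(j[2], mht_phi)) > cut
               for j, cut in zip(hard[:3], minimum_dphi))

def search_regions(jets, n_isolated_leptons=0):
    """Flags for the high-missing-H_T (> 250 GeV) and high-H_T (> 500 GeV) regions."""
    base = passes_baseline(jets, n_isolated_leptons)
    mht, _ = missing_h_t(jets)
    return {"high_mht": base and mht > 250.0, "high_ht": base and h_t(jets) > 500.0}
```

An event in which the missing momentum points along an undermeasured jet fails the azimuthal requirement even when its missing $H_T$ is large, which is the behaviour the selection relies on to suppress QCD multijet events.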

3.3 Data-simulation comparison 

Several Monte Carlo (MC) simulation samples are used, produced with a detailed CMS detector
simulation based on GEANT4 [30]. Samples of QCD multijet, $t\bar{t}$, W/Z+jets, $\gamma$+jets,
diboson, and single-top events were generated with the PYTHIA6 [31] and MadGraph [32] generators
using the CTEQ6.1L [33] parton distribution functions. For the $t\bar{t}$ background an
approximate next-to-next-to-leading-order (NNLO) cross section of 165 pb [34] is used, while the
cross sections for W($\ell\nu$)+jets (31 300 pb) and Z($\nu\bar{\nu}$)+jets (5769 pb) are derived
from an NNLO calculation with FEWZ [35]. While already excluded [17], the LM1 CMSSM point [36] is
used as a benchmark for new physics in this search. This point has a cross section of 6.5 pb at
NLO, calculated with PROSPINO [37]. It is defined to have a universal scalar mass
$m_0 = 60$ GeV, universal gaugino mass $m_{1/2} = 250$ GeV, universal trilinear soft
SUSY-breaking parameter $A_0 = 0$, the ratio of the vacuum expectation values of the two Higgs
doublets $\tan\beta = 10$, and the sign of the Higgs mixing parameter sign($\mu$) positive. The
squark and gluino masses for LM1 are respectively 559 GeV and 611 GeV, and the LSP mass is
96 GeV.
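
To give a feeling for the sample sizes implied by these cross sections, the sketch below multiplies each quoted cross section by the 36 pb$^{-1}$ integrated luminosity. The numbers are yields before any selection; the `efficiency` argument is a hypothetical placeholder for a selection efficiency and is not a value from the analysis.

```python
LUMINOSITY_INV_PB = 36.0  # integrated luminosity in pb^-1

# Cross sections in pb, as quoted in the text.
CROSS_SECTIONS_PB = {
    "ttbar": 165.0,
    "W(lnu)+jets": 31300.0,
    "Z(nunu)+jets": 5769.0,
    "LM1": 6.5,
}

def expected_events(cross_section_pb, luminosity_inv_pb=LUMINOSITY_INV_PB, efficiency=1.0):
    """Expected number of events N = sigma * L * efficiency."""
    return cross_section_pb * luminosity_inv_pb * efficiency

produced = {name: expected_events(xs) for name, xs in CROSS_SECTIONS_PB.items()}
```

With these inputs the LM1 benchmark corresponds to a few hundred produced events in the data sample, compared with roughly a million W($\ell\nu$)+jets events, which illustrates why strong background suppression is required.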

The event yields in the data and the simulated samples after two loose selections, the baseline
selection, and the two different search event selections are summarized in Table 1, where the
simulated event yields correspond to an integrated luminosity of 36 pb$^{-1}$. The $\not{H}_T$
and $H_T$ distributions for data and MC simulation are compared in Fig. 1 after the baseline
selection. In the following sections, however, all the backgrounds in this search are estimated
directly from data.



4 Z($\nu\bar{\nu}$)+jets background estimation

The production of a Z boson and jets, followed by the decay of the Z boson into neutrinos,
constitutes an irreducible background. The first method to estimate this background from the
data exploits the electroweak correspondence between the Z boson and the photon at high $p_T$,
where they exhibit similar characteristics, apart from electroweak coupling differences and
asymptotically vanishing residual mass effects [38]. The cross-section ratio between the Z-boson
and photon production provides a robust prediction of the missing transverse momentum spectrum
for invisible Z bosons at high $p_T$, where the photon production cross section is
asymptotically about 20% less than the one for inclusive Z-boson production. One important
distinction between photon and Z-boson production arises from the breakdown of the leading-order
calculation of the $\gamma$+jets process for small-angle or vanishing-energy emission of the
photon in the absence of a mass to regularize the resulting divergences. This can be mitigated
by imposing isolation requirements on the selected photon sample.

Table 1: Event yields in data and simulated samples for five different selection criteria. The
simulated yields are normalized to an integrated luminosity of 36 pb$^{-1}$. All simulated
samples were generated with the PYTHIA and MadGraph generators. The row labeled LM1 gives
the expected yield for the benchmark supersymmetric model described in the text.

Figure 1: The (left) $\not{H}_T$ and (right) $H_T$ distributions for the data and MC simulation
samples with all baseline selection cuts applied except the $\not{H}_T$ and $H_T$ requirements,
respectively. The distributions for the individual backgrounds are shown separately along with
the predicted distributions for the LM1 SUSY point. However, these simulated distributions are
not used to estimate the backgrounds in this analysis. Instead, the backgrounds are determined
directly from the data.

The $\gamma$+jets control sample is collected using single-photon triggers, which were measured
to be fully efficient for events passing the baseline selection. In the offline selection,
photon candidates are distinguished from electrons by a veto on the presence of a track seed in
the pixel detector. Photons from QCD multijet events are suppressed by requiring them to be
isolated and the shower shape in the $\eta$ coordinate to be consistent with that of a single
photon [39].

For the derivation of the Z/$\gamma$ cross-section correction factor, simulated $\gamma$+jets
and $Z \to \nu\bar{\nu}$ MadGraph samples are used, in addition to the QCD multijet, W/Z+jets,
and $t\bar{t}$ samples. The contribution of fragmentation photons, which do not have a
counterpart in the massive Z-boson production, is estimated from NLO JETPHOX [40] calculations
to be (5 ± 1)% [41] in the selected photon sample. A second background arises from isolated
neutral pions and $\eta$ mesons decaying to pairs of secondary photons. For high-momentum
mesons, these photon pairs are sufficiently well collimated to be reconstructed as a single
photon. Using a method that fits a photon isolation observable to the expected distributions
for real and background photons, the purity of the prompt photon sample is found to be about
94% after the baseline selection, with an uncertainty of 9% as listed in Table 2, which is in
good agreement with simulation. Finally, the background from electrons mis-identified as photons
is measured with $Z \to e^+e^-$ data events and is found to be negligible for the search
selections.

In order to predict the number of Z($\nu\bar{\nu}$)+jets events passing the search selections,
the selected $\gamma$+jets control sample needs the following corrections after the background
subtraction. First, the cross-section ratio between the Z($\nu\bar{\nu}$)+jets and $\gamma$+jets
processes is estimated from simulation. The photon selection and isolation cuts are applied to
the simulated samples when estimating this correction factor, hence folding the detector
acceptance correction into this Z/$\gamma$ correspondence. The correction factors for the
baseline, the high-$\not{H}_T$, and the high-$H_T$ selections are 0.41 ± 0.03, 0.48 ± 0.06, and
0.44 ± 0.06, respectively, where the uncertainties are statistical only. The uncertainty on the
acceptance is taken as 5% [17], while the theoretical uncertainty is estimated from a comparison
of leading to next-to-leading-order calculations of the ratio of Z and $\gamma$ production with
two jets [42]. This dedicated calculation was performed for the different selections in this
analysis adapted to only two jets. The addition of an extra jet is not expected to induce a
significant effect. This leads to a 10% theoretical uncertainty on the Z/$\gamma$ cross-section
ratio for the baseline selection, which is taken as a uniformly distributed systematic
uncertainty with a standard deviation of 6%. The photon reconstruction inefficiency is estimated
in Ref. [41] to be (3.5 ± 1.4)%. Finally, the photon identification and isolation efficiency is
corrected for the difference between data and simulation. The correction is determined [39] to
be 1.01 ± 0.02, after baseline selection.
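
The step from a 10% theoretical uncertainty to a 6% standard deviation is consistent with treating the 10% as the half-width of a uniform distribution, whose standard deviation is the half-width divided by $\sqrt{3}$. That reading is an assumption of this sketch, which simply evaluates the relation:

```python
import math

def uniform_standard_deviation(half_width):
    """Standard deviation of a uniform distribution on [-half_width, +half_width]."""
    return half_width / math.sqrt(3.0)

# A +-10% uniform band corresponds to a standard deviation of roughly 6%.
sigma_theory = uniform_standard_deviation(0.10)
```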

In Table 2 the full list of corrections is summarized for the baseline and search selections,
along with the corresponding systematic uncertainties. The results for the
Z($\nu\bar{\nu}$)+jets prediction from the $\gamma$+jets control sample are summarized in
Table 3. The prediction is in good agreement with the one found directly from the MC simulation,
also given in Table 3.

A potential alternative method to estimate the Z($\nu\bar{\nu}$)+jets background in a
conceptually more straightforward way uses Z($\ell^+\ell^-$)+jets data events. By counting the
pair of leptons as missing transverse momentum, the topology of the $Z \to \nu\bar{\nu}$ process
can be reproduced, and all jet-related selection criteria can be directly applied. Only a small
number of Z($\ell^+\ell^-$)+jets events pass the selection criteria in the currently available
data. After the baseline selection, applying $Z \to \ell^+\ell^-$ selection requirements and
correcting for the acceptance, efficiencies, and different branching fractions, the predicted
$Z \to \nu\bar{\nu}$ rates are found to be compatible with the simulation predictions within
uncertainties. However, none of the $Z \to e^+e^-$ and $Z \to \mu^+\mu^-$ events pass either of
the search selections.

More events can be used for predicting the Z($\nu\bar{\nu}$)+jets background by using W($\mu\nu$)+jets events.
Table 2: Overview of all correction factors and corresponding systematic uncertainties for the
prediction of the Z($\nu\bar{\nu}$)+jets background from the $\gamma$+jets control sample for
each of the selections.

                                     Baseline       High-$\not{H}_T$   High-$H_T$
                                     selection      selection          selection
Z/$\gamma$ correction  ± theory      0.41 ± 6%      0.48 ± 6%          0.44 ± 4%
                       ± acceptance       ± 5%           ± 5%               ± 5%
                       ± MC stat.         ± 7%           ± 13%              ± 13%
Fragmentation                        0.95 ± 1%      0.95 ± 1%          0.95 ± 1%
Secondary photons                    0.94 ± 9%      0.97 ± 10%         0.90 ± 9%
Photon mistag                        1.00 ± 1%      1.00 ± 1%          1.00 ± 1%
Photon identification and
isolation efficiency                 1.01 ± 2%      1.01 ± 2%          1.01 ± 2%
Total correction                     0.37 ± 14%     0.45 ± 18%         0.38 ± 17%
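
The "Total correction" row of Table 2 can be reproduced, to the quoted precision, by multiplying the individual correction factors and adding their relative uncertainties in quadrature. That combination rule is an assumption of this sketch (the text does not spell it out); the input numbers are copied from Table 2 and the names are illustrative.

```python
import math

# (central value, [relative uncertainties in %]) for each row of Table 2.
TABLE_2 = {
    "baseline": [(0.41, [6, 5, 7]), (0.95, [1]), (0.94, [9]), (1.00, [1]), (1.01, [2])],
    "high_mht": [(0.48, [6, 5, 13]), (0.95, [1]), (0.97, [10]), (1.00, [1]), (1.01, [2])],
    "high_ht": [(0.44, [4, 5, 13]), (0.95, [1]), (0.90, [9]), (1.00, [1]), (1.01, [2])],
}

def total_correction(factors):
    """Product of central values and quadrature sum of relative uncertainties (in %)."""
    value = math.prod(central for central, _ in factors)
    relative = math.sqrt(sum(u * u for _, uncertainties in factors for u in uncertainties))
    return value, relative

totals = {selection: total_correction(rows) for selection, rows in TABLE_2.items()}
```

Under this assumption the three selections give approximately 0.37 ± 14%, 0.45 ± 18%, and 0.38 ± 17%, matching the last row of the table.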



Table 3: Number of $\gamma$+jets events in the data and the resulting estimate of the
Z($\nu\bar{\nu}$)+jets background, as well as the prediction from the MC simulation, for each of
the selections, with their statistical and systematic uncertainties. The estimate from data is
obtained by multiplying the number of events in the $\gamma$+jets sample with the total
correction factor from Table 2.

                                         Baseline            High-$\not{H}_T$    High-$H_T$
                                         selection           selection           selection
$\gamma$+jets data sample                72                  16                  22
$Z \to \nu\bar{\nu}$ estimate from data  26.3 ± 3.2 ± 3.6    7.1 ± 1.8 ± 1.3     8.4 ± 1.8 ± 1.4
$Z \to \nu\bar{\nu}$ MC expectation      21.1 ± 1.4          6.3 ± 0.8           5.7 ± 0.7
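
The arithmetic behind the data-driven estimate in Table 3 can be sketched as follows: the $\gamma$+jets event count is multiplied by the total correction factor from Table 2. Treating the statistical uncertainty as the Poisson uncertainty of the control-sample count, and the systematic uncertainty as the relative uncertainty of the total correction, are assumptions of this sketch. Because the inputs are rounded, the results agree with Table 3 only approximately.

```python
import math

def z_invisible_estimate(n_gamma_jets, correction, relative_systematic):
    """Return (estimate, statistical, systematic) for the Z(nunu)+jets background."""
    estimate = n_gamma_jets * correction
    statistical = estimate / math.sqrt(n_gamma_jets)  # Poisson error of the control count
    systematic = estimate * relative_systematic
    return estimate, statistical, systematic

# Inputs: control-sample counts from Table 3, total corrections from Table 2.
baseline = z_invisible_estimate(72, 0.37, 0.14)
high_mht = z_invisible_estimate(16, 0.45, 0.18)
high_ht = z_invisible_estimate(22, 0.38, 0.17)
```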



This third method requires additional corrections for the W-Z correspondence and the $t\bar{t}$
contamination in the $\ell$+jets control sample. With the available data, a few events are
selected in the control samples for the search regions. The predicted number of
Z($\nu\bar{\nu}$)+jets background events from this method is consistent with the predictions
from the $\gamma$+jets events and the simulation.



5 W and $t\bar{t}$ background estimation



The muon and electron vetoes described in Section 3.2 aim to suppress SM events with an
isolated lepton. The W+jets and $t\bar{t}$ events, however, are not rejected by this lepton veto
when a lepton from a W or top-quark decay is outside the geometric or kinematic acceptances, not
reconstructed, not isolated (these three cases are denoted as a "lost lepton"), or is a tau
lepton that decays hadronically (denoted as $\tau_h$). In this section, two methods are
presented to estimate these two components of the W+jets and $t\bar{t}$ backgrounds from data.
The first method uses a $\mu$+jets control sample, after correcting for lepton inefficiencies,
to estimate the number of events that fail the isolated lepton reconstruction. The other method
predicts the hadronic $\tau$ background from a similar $\mu$+jets control sample by substituting
a $\tau$ jet for the muon. For both methods the chosen $\mu$+jets control sample fully
represents the hadronic and other properties of the background it predicts.

The sum of the lost-lepton and hadronic-$\tau$ predictions yields an estimate for the sum of the
W+jets and $t\bar{t}$ background. The $t\bar{t}$ contribution is also measured separately as a
cross-check. The method predicts the $t\bar{t}$ background from a b-tagged control sample by
correcting for the b-tag efficiency, acceptance, and the residual Z, W, and multijet
contamination. Using the W-to-$t\bar{t}$ ratio predicted by simulation, the result is found to
be consistent with the estimates described in the subsequent sections.

5.1 The W/$t\bar{t} \to e,\mu$+X background estimation

The background from W+jets and $t\bar{t}$ events, where a W boson decays into a muon or an
electron that is not rejected by the explicit lepton veto, is measured using a muon control
sample. This control sample is selected by requiring exactly one muon that is isolated and
passes the identification quality cuts discussed in Section 3.2. From simulation, more than 97%
of this sample are W+jets and $t\bar{t}$ events. In order to estimate the number of events in
the signal region with non-isolated, but identified electrons and muons, events in the
isolated-muon control sample are weighted according to

$\left( \frac{\epsilon^{e,\mu}_{\rm ID}}{\epsilon^{\mu}_{\rm ID}} \right) \left( \frac{1 - \epsilon^{e,\mu}_{\rm ISO}}{\epsilon^{\mu}_{\rm ISO}} \right)$,

where $\epsilon^{e,\mu}_{\rm ISO}$ are the electron and muon isolation efficiencies and
$\epsilon^{e,\mu}_{\rm ID}$ the corresponding identification efficiencies. To model the number
of events in the signal sample containing non-identified electrons or muons, the control sample
is corrected by the factor

$\left( \frac{1}{\epsilon^{\mu}_{\rm ISO}} \right) \left( \frac{1 - \epsilon^{e,\mu}_{\rm ID}}{\epsilon^{\mu}_{\rm ID}} \right)$.

The lepton isolation efficiency is measured from $Z \to \ell^+\ell^-$ events using a
tag-and-probe method [43] as a function of lepton $p_T$ and the angular distance between the
lepton and the nearest jet. The lepton identification efficiency is parametrized as a function
of lepton $p_T$ and $\eta$. Using these parametrizations, the efficiencies measured in Z events
can be applied to the kinematically different W+jets and $t\bar{t}$ events. The remaining
differences in the $p_T$ and $\eta$ spectra of the signal and control regions are found to be
smaller than 10% in the simulation.
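
The logic of the two lost-lepton weights can be illustrated with a closure check on invented numbers. The control sample contains muons that are both identified and isolated, so dividing by the muon identification and isolation efficiencies recovers the number of in-acceptance leptons, which is then multiplied by the probability of failing isolation or identification. The efficiencies and event counts below are hypothetical, and the function names are not from the analysis.

```python
def not_isolated_weight(eff_id_lepton, eff_iso_lepton, eff_id_muon, eff_iso_muon):
    """Weight for identified but non-isolated leptons: (e_ID/e_ID^mu) * (1 - e_ISO)/e_ISO^mu."""
    return (eff_id_lepton / eff_id_muon) * ((1.0 - eff_iso_lepton) / eff_iso_muon)

def not_identified_weight(eff_id_lepton, eff_id_muon, eff_iso_muon):
    """Weight for non-identified leptons: (1/e_ISO^mu) * (1 - e_ID)/e_ID^mu."""
    return (1.0 / eff_iso_muon) * ((1.0 - eff_id_lepton) / eff_id_muon)

# Closure check with hypothetical inputs: 1000 in-acceptance muons,
# identification efficiency 0.9 and isolation efficiency 0.8.
n_true, eff_id, eff_iso = 1000.0, 0.9, 0.8
n_control = n_true * eff_id * eff_iso  # identified and isolated muons
predicted_not_isolated = n_control * not_isolated_weight(eff_id, eff_iso, eff_id, eff_iso)
predicted_not_identified = n_control * not_identified_weight(eff_id, eff_id, eff_iso)
```

In the analysis the efficiencies depend on lepton kinematics, so the weights would be applied event by event rather than as single numbers.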

Leptons can be out of the acceptance because either their transverse momentum is too small
or they are emitted in the forward direction. Electrons and muons from $\tau$ decays in
particular tend to have low momentum, while the additional neutrinos add to the $\not{H}_T$ of
the event. The ratio $R_{\rm Accept}$ of events with out-of-acceptance leptons to those within
the acceptance is estimated using simulation. The same muon control sample described above is
used, weighted by $R_{\rm Accept}$ and corrected for the isolation and identification
efficiencies, to estimate the background from out-of-acceptance leptons.

The dominant uncertainties on the lost-lepton prediction arise from the statistical uncertainties 
of both the control sample and the Z sample from which the lepton efficiencies are measured. A 
systematic uncertainty is assigned to the kinematic differences between the control and signal 
regions that remain after the lepton-efficiency parametrization. The residual presence of QCD, 
Z, or diboson events in the control sample is taken into account as a systematic uncertainty. 
Finally, the systematic uncertainty due to the use of the simulation in the acceptance correction 
is considered. All uncertainties are summarized in Table 4. The total systematic uncertainty
amounts to approximately 18%. 



Table 4: Systematic uncertainties for the prediction of the lost-lepton background from the
$\mu$+jets control sample.

Isolation & identification eff.                         -13% / +14%
Kinematic differences between W, $t\bar{t}$, Z samples  -10% / +10%
SM background in $\mu$ control sample                    -3% / +0%
MC use for acceptance calculation                        -5% / +5%
Total systematic uncertainty                            -17% / +18%
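
The total in Table 4 is consistent with adding the downward and upward components separately in quadrature. That rule is assumed here rather than stated in the text; the sketch just carries out the sum with the values copied from the table.

```python
import math

# (downward %, upward %) for each source in Table 4.
TABLE_4 = {
    "isolation_and_identification": (13.0, 14.0),
    "kinematic_differences": (10.0, 10.0),
    "sm_background_in_control_sample": (3.0, 0.0),
    "mc_acceptance": (5.0, 5.0),
}

def asymmetric_quadrature(sources):
    """Quadrature sums of the downward and upward components, in percent."""
    down = math.sqrt(sum(d * d for d, _ in sources.values()))
    up = math.sqrt(sum(u * u for _, u in sources.values()))
    return down, up

total_down, total_up = asymmetric_quadrature(TABLE_4)
```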



The prediction from this method applied to the muon control sample collected using the same
$H_T$ triggers as for the search is compared in Table 5 to a prediction from simulated W+jets
and $t\bar{t}$ events using the same method, and to the direct prediction from two different MC
simulations. When applied to simulation, the method reproduces within the uncertainties the
direct expectations from the simulation. Using the prediction from data after the baseline
selection, about 50% more events are predicted than expected from the PYTHIA and MadGraph
simulated samples. The difference is due to the generator parameter tune in the MC samples that
were used to perform the comparison.

Table 5: Estimates of the number of lost-lepton background events from data and simulation
for the baseline and search selections, with their statistical and systematic uncertainties.

                               Baseline             High-$\not{H}_T$     High-$H_T$
                               selection            selection            selection
Estimate from data             33.0 ± 5.5 +…/-…     4.8 ± 1.8 +…/-…      10.9 ± 3.0 +…/-…
Estimate from MC (PYTHIA)      22.9 ± 1.3 +…/-…     3.2 ± 0.4 +…/-…      7.2 ± 0.7 +…/-…
MC expectation (PYTHIA)        23.6 ± 1.0           3.6 ± 0.3            7.8 ± 0.5
Estimate from MC (MadGraph)    22.9 ± 1.4 +…/-…     2.7 ± 0.4 +…/-…      5.4 ± 0.5 +…/-…
MC expectation (MadGraph)      23.7 ± 0.8           3.4 ± 0.3            6.5 ± 0.5



5.2 The W/$t\bar{t} \to \tau_h$+X background estimation

Hadronically decaying tau leptons constitute an important second component of the W and
$t\bar{t}$ background. In this section a method is described to estimate the hadronic-$\tau$
background from a $\mu$+jets control sample, mainly composed of W($\mu\nu$)+jets and
$t\bar{t}(\mu\nu)$+jets events. This muon control sample is selected from data collected with
single-muon triggers, ensuring independence from the hadronic activity in the event. Events are
required to have exactly one muon with $p_T > 20$ GeV and $|\eta| < 2.1$ and to satisfy the
identification and isolation requirements described in Section 3.

Jets from tau leptons are characterized by a low multiplicity of particles, typically a few
pions and neutrinos. The hadronic properties of events in the hadronic-$\tau$ background are
identical to those of the muon control sample, except for the fraction of the $\tau$-jet energy
deposited in the calorimeters. To account for this difference, each muon in the control sample
is replaced by a $\tau$ jet. The momentum of this $\tau$ jet is obtained by scaling the muon
momentum by a factor obtained from a simulated energy response distribution that models the
fraction of visible momentum as a function of the true lepton momentum [44, 45]. The extra jet
is then taken into account when applying the selection cuts to obtain the hadronic-$\tau$
background prediction from these modified events. In order to probe the full response
distribution, this procedure is repeated multiple times for each event.
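
The muon-to-$\tau$-jet replacement with repeated sampling can be sketched as below. This is a deliberately simplified illustration: the response template is a plain list of hypothetical visible-momentum fractions (the analysis uses a simulated distribution that depends on the lepton momentum), and the selection is reduced to a caller-supplied function of the emulated $\tau$-jet $p_T$ instead of the full event selection.

```python
import random

def hadronic_tau_yield(muon_pts, response_template, passes_selection,
                       n_samples=100, seed=0):
    """Predicted yield after replacing each muon by an emulated tau jet.

    Each control-sample muon is reused n_samples times; every replica scales
    the muon pT by a response fraction drawn from the template and carries a
    weight of 1/n_samples, so one control event contributes at most 1.
    """
    rng = random.Random(seed)
    total = 0.0
    for muon_pt in muon_pts:
        for _ in range(n_samples):
            tau_jet_pt = muon_pt * rng.choice(response_template)
            if passes_selection(tau_jet_pt):
                total += 1.0 / n_samples
    return total

# Hypothetical inputs for illustration only.
template = [0.3, 0.5, 0.7, 0.9]      # visible-momentum fractions
control_muon_pts = [40.0, 80.0, 150.0]  # GeV
prediction = hadronic_tau_yield(control_muon_pts, template, lambda pt: pt > 50.0)
```

Sampling the template many times per event uses the full response distribution, but it also means that the replicas of one event are correlated, which is why the statistical uncertainty needs a dedicated treatment.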

A correction is applied for the kinematic and geometric acceptances of the muons in the
control sample. It is determined by applying a muon smearing procedure to events in W and
$t\bar{t}$ simulated samples with a muon from W decay passing the muon kinematic selection, and
comparing the resulting yield to the one obtained using all muons from W decay in the same
events. The resulting correction factor is 0.84 ± 0.05 for the baseline and high-$\not{H}_T$
selection, and 0.89 ± 0.05 for the high-$H_T$ selection. A second correction takes into account
the muon trigger, reconstruction, and isolation efficiencies. The same procedure described in
Section 5.1 is followed. A correction is also applied for the relative branching fractions of W
decays into muons or hadronic $\tau$ jets. For the simulated events a factor of 0.65 is used in
the generation of the events and as the correction factor, while for data a factor of 0.69 is
applied [46].
The procedure for predicting the hadronic-$\tau$ background was tested on simulated W and
$t\bar{t}$ events and reproduces the direct results from the simulation of genuine hadronic tau
leptons from W and $t\bar{t}$ decays within uncertainties. For the baseline selection this
uncertainty amounts to 12% and 3% for the W and $t\bar{t}$ samples, respectively. The evaluation
of the statistical uncertainty on the prediction needs special attention owing to the multiple
sampling of the response distribution. This uncertainty is evaluated with a set of
pseudo-experiments using the so-called bootstrap technique [47].

The systematic uncertainties and their impact on the prediction are summarized in Table 6.
The possible difference between data and simulation for the $\tau$ energy distribution is taken into
account as a systematic uncertainty estimated by scaling the visible energy fraction by 3% [48],
resulting in a variation in the $\slashed{H}_T$ prediction of 2%. Possible SM background contamination
in the muon control region comes from $Z \to \mu^+\mu^-$, $t\bar{t}$/W+X $\to \tau\nu$+X $\to \mu\nu$+X, and from
QCD multijet events. The first two are subtracted using the MC simulation, while the QCD
multijet background is studied using an orthogonal control sample of events with non-isolated
muons. The main source of background is $W \to \tau\nu \to \mu\nu$, estimated to be 10% of the total
control sample. The number of W/$t\bar{t}$ $\to \tau_h$+X events predicted in data using this method is
summarized in Table 7 for the different signal regions.

Table 6: Systematic uncertainties for the hadronic-$\tau$ background prediction from the $\mu$+jets
control sample for the baseline and search selections.

                                Baseline      High-$\slashed{H}_T$    High-$H_T$
                                selection     selection               selection
$\tau$ response distribution    2%            2%                      2%
Acceptance                      +6%/-5%       +6%/-5%                 +6%/-5%
Muon efficiency in data         1%            1%                      1%
SM backgr. subtraction          5%            5%                      5%


Table 7: Predicted number of hadronic-$\tau$ background events from data and simulation for the
baseline and search selections, with their statistical and systematic uncertainties.

                                          Baseline            High-$\slashed{H}_T$   High-$H_T$
                                          selection           selection              selection
W/$t\bar{t}$ $\to \tau_h$ estimate from data   22.3 ± 4.0 ± 2.2    6.7 ± 2.1 ± 0.5        8.5 ± 2.5 ± 0.7
W/$t\bar{t}$ $\to \tau_h$ MC expectation       19.9 ± 0.9          3.0 ± 0.4              5.5 ± 0.5


6 QCD background estimation 

Two methods are employed to estimate the QCD multijet contamination in this analysis. The
"rebalance-and-smear" (R&S) method estimates the multijet background directly from the data.
This method predicts the full kinematics in multijet events, while being unaffected by events
with true missing transverse momentum, including the potential presence of a signal. Crucial
inputs to this method are the jet energy resolutions, which are measured from data, including
the non-Gaussian tails. The "factorization method" provides an alternative prediction for the
QCD multijet background, based on the extrapolation from a lower-$\slashed{H}_T$ control region to the
high-$\slashed{H}_T$ search region using the correlation between $\slashed{H}_T$ and an angular variable.



6.1 The rebalance-and-smear method 

Large missing transverse momentum arises in QCD multijet events when one or more jets 
in the event have a jet energy response far from unity, where the jet energy response is the 
ratio of the transverse momentum of the reconstructed jet over the one which would result 
from measuring perfectly the four-momenta of the particles in the jet ("particle jet"). The R&S 
method is essentially a simplified simulation where the jet energy response is modelled by a 
parametrized resolution function, which is used to smear a sample of "seed events" obtained 
from data and consisting of "seed jets" that are good estimators of the true particle-jet momenta. 

The seed events are produced in the "rebalance" step with an inclusive multijet data sample
as input. Using the resolution probability distribution $r$, these seed events are constructed by
adjusting the jet momenta in events with $n$ jets given the likelihood

$\mathcal{L} = \prod_{i=1}^{n} r(p_{T,i}^{\mathrm{reco}} \,|\, p_{T,i}^{\mathrm{true}})$,

where $p_{T,i}^{\mathrm{reco}}$ and $p_{T,i}^{\mathrm{true}}$ are the reconstructed and true jet transverse momentum, respectively.
The likelihood is maximized as a function of $p_{T,i}^{\mathrm{true}}$, subject to the transverse momentum balance
constraint

$\sum_{i=1}^{n} \vec{p}_{T,i}^{\,\mathrm{true}} + \vec{p}_{T,\mathrm{soft}}^{\,\mathrm{true}} = 0$.

Here, all clustered objects with $p_T > 10$ GeV are classified as jets and $\vec{p}_{T,\mathrm{soft}}^{\,\mathrm{true}}$, which is the
true momentum of the rest of the event, can be approximated by the measured $\vec{p}_{T,\mathrm{soft}}^{\,\mathrm{reco}}$ that
comprises all particles not included in the jets. In other words, in
the rebalancing step all of the jet momenta are adjusted, given the measured jet momenta and
the jet resolution functions, to bring the event into transverse momentum balance. This forces
events with genuine high $\slashed{H}_T$ from neutrinos or other undetected particles to be similar to
well-balanced QCD-like events. As such, $t\bar{t}$, W+jets, and Z+jets events, and also contributions from
new physics, if any, have negligible impact on the background prediction since their production
rate is much smaller than the QCD multijet production rate.
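
For a purely Gaussian resolution model, maximizing this likelihood under the balance constraint is a linearly constrained least-squares problem with a closed-form solution. The sketch below illustrates that special case only. It assumes fixed jet directions, fixed per-jet widths `sigma`, and at least two non-collinear jets; the function name `rebalance` and the numerical values are hypothetical.

```python
import numpy as np

def rebalance(pt_reco, phi, sigma, soft_px=0.0, soft_py=0.0):
    """Most probable jet pT values for Gaussian resolutions, with the
    transverse momentum balance imposed exactly.

    Solves  min sum_i ((x_i - pt_reco_i) / sigma_i)^2
    subject to  sum_i x_i * (cos phi_i, sin phi_i) + soft = 0
    using Lagrange multipliers. Jet directions are kept fixed.
    """
    pt_reco = np.asarray(pt_reco, dtype=float)
    var = np.asarray(sigma, dtype=float) ** 2
    A = np.vstack([np.cos(phi), np.sin(phi)])   # 2 x n constraint matrix
    b = -np.array([soft_px, soft_py])           # required jet vector sum
    M = (A * var) @ A.T                         # A V A^T (2 x 2)
    lam = np.linalg.solve(M, A @ pt_reco - b)   # Lagrange multipliers
    return pt_reco - var * (A.T @ lam)

# Illustrative three-jet event (GeV); widths set to 10% of the measured pT.
phi = np.array([0.0, 2.5, -2.2])
pt_measured = np.array([200.0, 120.0, 90.0])
pt_bal = rebalance(pt_measured, phi, sigma=0.1 * pt_measured)
```

Jets with a larger assumed width absorb more of the adjustment, which is the intended behaviour: poorly measured jets are moved furthest to restore balance.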

Most of the events in the seed sample consist of jets with responses well within the core of the
resolution distribution. Because of this, the Gaussian resolution model is sufficient in the
computation of the likelihood. A correction is needed, though, because jets near masked ECAL cells
have an energy response below unity, as discussed in Section 6.2, and hence get systematically
rebalanced to too-low energies. To mitigate the resulting bias, each event is randomized in $\phi$
after rebalancing, such that well-rebalanced events dominate everywhere. A second correction
to the rebalancing procedure is applied to account for the migration of reconstructed events
towards higher $H_T$ due to residual resolution smearing effects. An empirical term is included
in the likelihood function to compensate for this migration. This correction induces less than a
5% change in the resulting distributions from data.

Next, the momentum of each seed jet is smeared using the jet resolution distribution. The
search requirements can be applied to the resulting smeared events to predict all event-by-event
jet kinematic properties and correlations. This allows for flexibility in the set of observables
used to define the search region, and in characterizing an observed signal. The distributions
predicted by the R&S procedure are compared with those from MC simulation in Fig. 2.
The predicted $\slashed{H}_T$ and $H_T$ distributions are within 40% of the actual MC distributions in the
search regions; the corresponding numbers of events are listed in Table 8.
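
The smearing step can be sketched as below. The response model `toy_response` (Gaussian core plus a flat low-side tail) and all of its parameters are hypothetical stand-ins for the measured resolution functions, and the jet thresholds used in the analysis definitions of the two observables are omitted for brevity.

```python
import numpy as np

def ht_and_mht(pt, phi):
    """Scalar pT sum and magnitude of the vector pT sum of the jets."""
    ht = float(np.sum(pt))
    mht = float(np.hypot(np.sum(pt * np.cos(phi)), np.sum(pt * np.sin(phi))))
    return ht, mht

def toy_response(n, rng, core_width=0.1, tail_fraction=0.01):
    """Hypothetical response: Gaussian core plus a flat low-side tail."""
    r = rng.normal(1.0, core_width, size=n)
    in_tail = rng.random(n) < tail_fraction
    r[in_tail] = rng.uniform(0.2, 0.8, size=in_tail.sum())
    return r

def smear(seed_pt, rng):
    """Multiply each seed-jet pT by an independently sampled response."""
    return seed_pt * toy_response(seed_pt.size, rng)

rng = np.random.default_rng(3)
seed_pt = np.array([250.0, 150.0, 100.0])      # balanced seed jets (GeV)
seed_phi = np.array([0.0, 2.4981, -2.3000])    # illustrative directions
smeared_ht, smeared_mht = ht_and_mht(smear(seed_pt, rng), seed_phi)
```

Repeating the smearing many times per seed event populates the tails of the predicted distributions, after which the search selection can be applied to the smeared events like any other sample.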

6.2 Jet response distributions 

For smearing, and therefore the prediction of the $\slashed{H}_T$ spectrum, the full resolution functions
including the non-Gaussian tails are used. The tails of the jet response function are of particular
importance for the prediction of the QCD multijet background at high $\slashed{H}_T$.

The jet momentum resolution functions are parametrized using simulated PYTHIA dijet samples
and adjusted to match the measurements from data, as described below. The response
distributions are parametrized with respect to $p_T$ and $\eta$. Furthermore, an exceptionally low
response arises at the specific $\eta$-$\phi$ locations where ECAL channels have been masked. This
effect is taken into account by parametrizing the jet response as a function of the fraction
$f_{\mathrm{masked}}$ of jet momentum lost in the masked area of the detector, computed using a template
for the $p_T$-weighted distribution of particles as a function of the distance in $\eta$ and $\phi$ to the
jet axis. The dependence of the jet resolution on $f_{\mathrm{masked}}$ is shown in Fig. 3 (left). Finally,
heavy-flavour b or c quarks and also gluons exhibit different jet resolution shapes than light jets,
as shown in Fig. 3 (right). At high jet $p_T$, decays of heavy-flavour hadrons into neutrinos become
one of the dominant sources of significant jet energy loss. The jet resolution functions are
determined for bottom, charm, gluon, and other light-flavour quarks separately. The flavour
dependence is then accounted for by using these resolution functions in the smearing procedure
according to the flavour fractions from simulation.

Figure 2: The (left) $\slashed{H}_T$ and (right) $H_T$ distributions from the R&S method applied to simulation
events, compared to the actual MC distribution (MC truth), for events passing the $\geq 3$ jets,
$H_T > 300$ GeV, and $\Delta\phi(\slashed{H}_T, \mathrm{jet}\ 1\text{-}3)$ selections, and additionally $\slashed{H}_T > 150$ GeV for the right plot.
(CMS Simulation, $\sqrt{s} = 7$ TeV, $L = 36$ pb$^{-1}$.)

Two methods are used to measure from data a scaling factor for the Gaussian core of the jet
momentum resolutions determined from simulation [49]. At low $p_T$, $\gamma$+jet events are used
because the photons are reconstructed with excellent energy resolution and the $p_T$ balance
makes the photons good estimators of the true $p_T$ scale of the event. At larger $p_T$, dijet events
are used. An unbinned maximum likelihood fit is performed on the dijet asymmetry,
$(p_T^{\mathrm{jet1}} - p_T^{\mathrm{jet2}})/(p_T^{\mathrm{jet1}} + p_T^{\mathrm{jet2}})$, with random ordering of the two highest-$p_T$ jets. For both
measurements the presence of additional jets in the event destroys the momentum balance and an
extrapolation to no-additional-jet activity is performed. These methods measure the core of the
Gaussian resolution as a function of jet $\eta$ to be 5-10% larger in data compared to simulation,
with systematic uncertainties of similar size as the deviation. No significant dependence on the
$p_T$ of the jet is observed.
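
The link between the asymmetry width and the jet resolution can be illustrated with a toy simulation. For two jets with the same true $p_T$ and the same small relative resolution, the width of the asymmetry is approximately the relative resolution divided by $\sqrt{2}$. The sketch below uses the sample standard deviation as a simplified stand-in for the unbinned likelihood fit, and all numerical values are illustrative assumptions.

```python
import numpy as np

def dijet_asymmetry(pt_a, pt_b, rng):
    """Asymmetry (pT1 - pT2) / (pT1 + pT2) with random ordering of the jets."""
    swap = rng.random(pt_a.shape) < 0.5
    first = np.where(swap, pt_b, pt_a)
    second = np.where(swap, pt_a, pt_b)
    return (first - second) / (first + second)

rng = np.random.default_rng(7)
true_pt, rel_resolution = 300.0, 0.10          # assumed toy values
n_events = 200_000

# Two jets per event, each smeared independently around the same true pT.
pt1 = true_pt * rng.normal(1.0, rel_resolution, n_events)
pt2 = true_pt * rng.normal(1.0, rel_resolution, n_events)

asym = dijet_asymmetry(pt1, pt2, rng)
# Invert sigma_A ~ sigma(pT)/pT / sqrt(2) to estimate the resolution.
estimated_resolution = np.sqrt(2.0) * asym.std()
```

The random ordering makes the asymmetry distribution symmetric around zero, so that its width rather than its mean carries the resolution information.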

No significant non-Gaussian tails are observed in $\gamma$+jet events. At higher $p_T$, the dijet
asymmetry distributions show compatibility within uncertainties between the resolution tails from
data and simulation. Using the ratio of these asymmetry distributions in data and simulation,
correction factors to the jet resolution tails from simulation are derived.

Both a scaling of the response below ("low" tail) and above unity ("high" tail) can induce the
same change in the asymmetry distribution; the latter arising for instance from mismeasured
track momenta in particle-flow jets. Therefore, the nominal resolution functions are obtained by
equally scaling both the lower and upper tails of the resolution in order to induce the observed
scaling of the asymmetry tail. The envelope of the variations induced by only low- or high-tail
scaling is taken as the systematic uncertainty band for the jet resolution distribution.

Figure 3: Ratio of the reconstructed jet transverse momentum to the generated transverse
momentum for jets with $p_T^{\mathrm{gen}} > 300$ GeV. Distributions are shown for (left) different values of
$f_{\mathrm{masked}}$ and (right) gluons and different quark flavours.

6.3 Results of the rebalance-and-smear method 

The performance of the R&S procedure was validated using simulated PYTHIA QCD multijet
samples, without pileup interactions, where the parametrized response functions are derived
from the same samples. The predicted and expected number of events are summarized for
several selections in Table 8. Before the $\slashed{H}_T$ requirement, the prediction of the $H_T$ spectrum, the
jet kinematics, and the jet-jet angular and $p_T$ correlation distributions agree within 10% with
the direct simulation. The $\slashed{H}_T$ distribution shows a bias up to 40%, which is mostly due to a
dependency of the jet resolution on the presence of nearby jets. This is only of importance in the
region of very high $\slashed{H}_T$, however, where the QCD multijet contribution is negligible compared
to other backgrounds.

Table 8: Number of events passing the various event selections from the PYTHIA multijet sample,
the R&S method applied to the same simulated sample, and their ratio. The uncertainties
quoted are statistical only.

                     Baseline selection,       Baseline       High-$\slashed{H}_T$   High-$H_T$
                     no $\Delta\phi$ cuts      selection      selection              selection
N(PYTHIA)            138.6 ± 1.3               11.4 ± 0.4     0.13 ± 0.04            8.46 ± 0.32
N(R&S)               160.2 ± 0.1               13.2 ± 0.1     0.177 ± 0.004          9.57 ± 0.04
N(R&S)/N(PYTHIA)     1.16 ± 0.01               1.15 ± 0.04    1.4 ± 0.4              1.13 ± 0.05




The QCD multijet background is predicted using the inclusive data sample with events passing
the same $H_T$ triggers described in Section 3.1. The R&S steps are then executed using the jet
energy resolution functions and the core and tail scale factors described in Section 6.2. The
background predictions are obtained by applying the event selection requirements to the R&S
events. The rejection efficiency of events with large energy loss in masked ECAL channels is
modelled using a parametrized per-jet probability from simulation.

Table 9: Number of QCD multijet events predicted with the R&S method, before and after bias
corrections, along with all considered uncertainties and the type of uncertainty (uniform "box"-
like, symmetric or asymmetric Gaussian distribution). Effects in italics are the biases corrected
for as described in the text, with the full size of the bias taken as the systematic uncertainty.

                                                   Baseline          High-$\slashed{H}_T$   High-$H_T$
                                                   selection         selection              selection
Nominal prediction (events)                        39.4              0.18                   19.0
Particle jet smearing closure      (box)           +14%              +30%                   +7%
Rebalancing bias                   (box)           +10%              +10%                   +10%
Soft component estimator           (box)           +3%               +19%                   +4%
Resolution core                    (asymmetric)    +[...]%/-25%      +[...]%/-52%           +[...]%/-21%
Resolution tail                    (asymmetric)    +43%/-33%         +56%/-78%              +48%/-34%
Flavour trend                      (symmetric)     ±1%               ±12%                   ±0.3%
Pileup effects                     (box)           ±2%               ±10%                   ±2%
Control sample trigger             (box)           -5%               -5%                    -5%
Search trigger                     (symmetric)     ±1%               ±1%                    0%
Lepton veto                        (box)           ±5%               ±0.05%                 ±0.2%
Seed sample statistics             (symmetric)     ±2.3%             ±23%                   ±3.3%
Total uncertainty                                  51%               64%                    49%
Bias-corrected prediction (events)                 29.7 ± 15.2       0.16 ± 0.10            16.0 ± 7.9

In Table 9 the number of predicted events is listed for the baseline and search regions, along
with the relevant systematic uncertainties. Corrections are applied to the background estimates
for several known biases in the method, as summarized in Table 9. The largest one pertains to
the smearing step, and arises from ambiguities in how the jet resolution is defined and from
limitations in the parametrization. It is obtained in simulation by comparing the prediction
from smeared particle jets with the corresponding one from the detector simulation. The size
of the difference is taken as both a bias correction and a systematic uncertainty.

A second bias is intrinsic to the rebalancing procedure, and is studied by iterating the R&S
method. A first iteration (R&S)$^1$ of the method gives a sample of pure QCD multijet events
with known true jet resolution, i.e., by construction the one used in the smearing step.
Performing a second iteration (R&S)$^2$ of the method on this (R&S)$^1$ sample, using the same
resolutions, provides a closure test of just the rebalancing part when compared to the input
(R&S)$^1$ events. The degree of non-closure is measured to be 10%, which is also assigned as a
systematic uncertainty.

The same (R&S)$^2$/(R&S)$^1$ procedure is employed to study the bias caused by using
$\vec{p}_{T,\mathrm{soft}}^{\,\mathrm{reco}}$ as an estimator of $\vec{p}_{T,\mathrm{soft}}^{\,\mathrm{true}}$. The true value of $\vec{p}_{T,\mathrm{soft}}^{\,\mathrm{true}}$ in the second
iteration is equal to the $\slashed{H}_T$ value calculated from the rebalanced jets in the first iteration.
The difference between the (R&S)$^2$ predictions with $\vec{p}_{T,\mathrm{soft}}^{\,\mathrm{reco}}$ and $\vec{p}_{T,\mathrm{soft}}^{\,\mathrm{true}}$ as input is
used as a third bias correction, with corresponding systematic uncertainty.

The largest systematic effect arises from uncertainties on the jet momentum resolution. The
measurement uncertainties on the core resolutions and non-Gaussian tails, discussed in
Section 6.2, are propagated by repeating the R&S prediction with resolution inputs varied within
these uncertainties. Another systematic uncertainty comes from the flavour-dependent
parametrization of the jet resolutions. It is evaluated as the difference between the use of PYTHIA
and MADGRAPH simulated samples to derive the b- and c-quark content parametrization.
These MC generators have heavy-flavour fractions that differ by roughly 25% for bottom and
50% for charm quarks. Nevertheless, the difference in the resulting background prediction is
very small in the high-$H_T$ search regions, and the QCD multijet contribution is negligible in the
high-$\slashed{H}_T$ search region.

The effect of pileup is studied by performing the R&S prediction with a subset of events with 
exactly one reconstructed primary vertex. The relative difference between this prediction and 
the one obtained from the inclusive sample is taken as a systematic uncertainty. 

Other smaller uncertainties arise from the event selection. A potential loss of events due to the
$H_T$ trigger requirement on the events that enter the rebalancing is quantified by comparing to the
prediction made with the small number of events collected with a low-$p_T$ single-jet trigger. A
conservative upper bound of 5% on this uncertainty is taken. Another uncertainty arises from
the need to predict the number of smeared events that pass the search trigger. The $H_T$ triggers
used were measured on data to be fully efficient with respect to events passing the offline cuts,
and the statistical upper bound from this measurement is taken as a systematic uncertainty
for the low-$H_T$ selections. Finally, the lepton veto has an uncertainty that is estimated as the
full size of the rejection rate for QCD multijet events in a PYTHIA event sample with pileup
conditions representative of the data. The large size of this uncertainty for the baseline search
region is due to a near-100% statistical uncertainty induced by an MC sample with a very small
equivalent luminosity.

Variations within one standard deviation or within upper and lower bounds are performed
for each systematic effect, and the corresponding differences in the predictions are quoted in
Table 9. Estimated shapes of the probability distribution of each uncertainty are also listed;
uncertainties that are estimated as upper bounds on possible effects are assumed to have a
uniform "box"-like distribution. The statistical uncertainty is associated with the size of the seed
event sample. As prescribed by the bootstrap method [47], an ensemble of pseudo-datasets is
selected randomly from the original seed sample, allowing repetition. The ensemble spread of
predictions made from these pseudo-datasets is taken as the statistical uncertainty.
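
The bootstrap procedure described here can be sketched generically. The function `bootstrap_uncertainty` and the toy "prediction" (counting seed events that carry a pass flag) are hypothetical; in the analysis the prediction would be the full R&S chain run on each pseudo-dataset.

```python
import numpy as np

def bootstrap_uncertainty(seed_events, predict, n_pseudo, rng):
    """Spread of predictions over pseudo-datasets resampled with repetition."""
    seed_events = np.asarray(seed_events)
    n = len(seed_events)
    predictions = np.empty(n_pseudo)
    for k in range(n_pseudo):
        # Draw n events from the seed sample, allowing repeats.
        resampled = seed_events[rng.integers(0, n, size=n)]
        predictions[k] = predict(resampled)
    return predictions.std(ddof=1)

rng = np.random.default_rng(11)
# Toy seed sample: 1000 events, 100 of which would pass the selection.
pass_flags = np.zeros(1000)
pass_flags[:100] = 1.0

stat_unc = bootstrap_uncertainty(pass_flags, predict=np.sum, n_pseudo=500, rng=rng)
```

For this toy case the result should be close to the binomial expectation $\sqrt{1000 \times 0.1 \times 0.9} \approx 9.5$ events, which provides a simple sanity check of the resampling.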

After correcting for biases, the R&S prediction and systematic uncertainties are combined via
the procedure explained in Section 7.1, which takes properly into account non-Gaussian
distributed uncertainties. The mean and r.m.s. deviation of the resulting distributions of the
expected number of multijet background events for the baseline and search selections are taken
as the central values and uncertainties of the final R&S prediction, as given in the last row of
Table 9. These central values are slightly shifted compared to the nominal bias-corrected values
owing to the asymmetrically distributed uncertainties.
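
The shift of the central value caused by asymmetric uncertainties can be illustrated with a small Monte Carlo combination. This is a sketch under stated assumptions, not the procedure of Section 7.1: relative shifts are added linearly, the asymmetric Gaussian is modelled as two half-Gaussians with equal probability on each side (one possible convention), and only three of the sources are included, with illustrative sizes.

```python
import numpy as np

def sample_box(low, high, n, rng):
    """Uniform ("box") relative shift between low and high."""
    return rng.uniform(low, high, n)

def sample_bifurcated_gaussian(sigma_minus, sigma_plus, n, rng):
    """Asymmetric Gaussian: width sigma_plus above zero, sigma_minus below."""
    z = rng.normal(0.0, 1.0, n)
    return np.where(z >= 0.0, z * sigma_plus, z * sigma_minus)

def combine(nominal, relative_shifts):
    """Mean and r.m.s. deviation of the prediction after all shifts."""
    total = nominal * (1.0 + np.sum(relative_shifts, axis=0))
    total = np.clip(total, 0.0, None)   # event counts cannot be negative
    return total.mean(), total.std()

rng = np.random.default_rng(5)
n_toys = 200_000
shifts = [
    sample_bifurcated_gaussian(0.34, 0.48, n_toys, rng),  # asymmetric source
    sample_box(-0.02, 0.02, n_toys, rng),                 # box-like source
    rng.normal(0.0, 0.033, n_toys),                       # symmetric source
]
mean_rel, rms_rel = combine(1.0, shifts)
```

With a larger upward than downward width, the mean of the combined distribution lies above the nominal value, which is the kind of shift referred to in the text.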

6.4 The factorization method 

Because of the importance of estimating the QCD multijet background, an independent
approach is used as a cross-check. The factorization method uses the observables $\slashed{H}_T$ and
$\Delta\phi_{\min}$, of which the latter is the minimum azimuthal angle between the $\slashed{H}_T$ direction and the
three leading jets, to predict the number of events in the signal region of high $\slashed{H}_T$ and large
$\Delta\phi_{\min}$ from the sideband regions where one or both variables are small. As $\slashed{H}_T$ and
$\Delta\phi_{\min}$ are not independent observables, their correlation is measured in the low-$\slashed{H}_T$ region
by means of the ratio $r(\slashed{H}_T)$ of the number of events with large $\Delta\phi_{\min}$ to the number with
small $\Delta\phi_{\min}$. The number of background events is estimated from the extrapolation of $r$ to the
high-$\slashed{H}_T$ signal region.

The parametrization of $r(\slashed{H}_T)$ is chosen empirically, with two different ones being used. The
first parametrization, the Gaussian model, predicts a Gaussian distribution for $\Delta\phi_{\min}$,
assuming all jets, except the most mismeasured jet, to have an energy response following a Gaussian
resolution function. The width of this distribution as a function of $\slashed{H}_T$ is described both in
simulation and data by a falling exponential function, from which the functional form for
$r(\slashed{H}_T)$ is derived. An additional constant term, determined from a MADGRAPH QCD multijet
simulation, is added to $r(\slashed{H}_T)$ to keep it more nearly constant at high values of $\slashed{H}_T$. A large
value of $H_T$ is further required to suppress events with low-$p_T$ jets at low $\slashed{H}_T$. This method
results in a prediction for a lower limit on the number of expected QCD multijet background events
in the signal region, since any non-Gaussian tails in the $\Delta\phi_{\min}$ resolutions result in a larger
estimate.

As an alternative to the Gaussian resolution model, $r(\slashed{H}_T)$ is parametrized as an exponential
plus the same constant term used in the Gaussian model. The same $H_T$ cut is applied. The
extrapolation to high $\slashed{H}_T$ leads to a larger $r(\slashed{H}_T)$ value than observed in the simulation. Various
systematic variations of simulated QCD samples show that the true yield of the QCD multijet
background is between the predictions from the two parametrizations.
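
The exponential parametrization and its extrapolation can be sketched as follows. The functional form $r(x) = e^{a + bx} + c$ follows the description above, but the unweighted straight-line fit, the function names, and every numerical value are assumptions made for illustration only.

```python
import numpy as np

def fit_exponential_ratio(mht, r, constant):
    """Fit r(x) = exp(a + b*x) + constant via a line fit to log(r - constant).

    Returns (a, b). The fit is unweighted, which is a simplification.
    """
    slope, intercept = np.polyfit(mht, np.log(r - constant), 1)
    return intercept, slope

def predict_signal_yield(n_small_dphi, mht_value, a, b, constant):
    """Extrapolated ratio times the small-delta-phi yield at high missing HT."""
    ratio = np.exp(a + b * mht_value) + constant
    return ratio * n_small_dphi

# Hypothetical control-region measurements of the ratio at low missing HT (GeV).
mht_points = np.array([70.0, 90.0, 110.0, 130.0])
true_a, true_b, const = 1.0, -0.03, 0.02
ratios = np.exp(true_a + true_b * mht_points) + const

a_fit, b_fit = fit_exponential_ratio(mht_points, ratios, const)
# Extrapolate to 200 GeV for a hypothetical 100 events with small delta-phi.
n_predicted = predict_signal_yield(100.0, 200.0, a_fit, b_fit, const)
```

At high values of the extrapolation variable the exponential term dies away and the constant term dominates the predicted ratio, which is why its uncertainty matters for both models.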

The dominant uncertainty on the prediction is the statistical uncertainty from the data in the
control region and the statistical uncertainties on the fit parameters. A systematic uncertainty
arises from the constant term at high $\slashed{H}_T$ for both models, which is estimated to be +11%/-6%
from a variety of different simulated samples. Further systematic uncertainties come from the
SM background contamination in the control regions, +4%/-8%, and from the $H_T$ requirement
discussed above, +0%/-11%.

The predictions for the QCD multijet background from the two parametrizations are given in
Table 10 for the three different selections. The final background estimate is taken as the
average of the two model predictions, with half the difference assigned as an additional systematic
uncertainty and added linearly to the uncertainty on the combination. The results are in
agreement with the predictions using the R&S method.
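
The averaging rule can be checked with a few lines of arithmetic, using the central values of the two models quoted in Table 10. The helper name `combine_models` is hypothetical, and only the central value and the model-spread term are computed; the other uncertainties are not propagated here.

```python
def combine_models(gaussian, exponential):
    """Average of the two model predictions and half their difference."""
    central = 0.5 * (gaussian + exponential)
    half_difference = 0.5 * abs(exponential - gaussian)
    return central, half_difference

# Central values (events) of the Gaussian and exponential models.
selections = {
    "baseline": (19.0, 31.4),
    "high missing-HT": (0.3, 0.5),
    "high HT": (13.0, 21.6),
}
for name, (gauss, expo) in selections.items():
    central, spread = combine_models(gauss, expo)
    print(f"{name}: {central:.1f} events, model spread {spread:.1f}")
```

The averages reproduce the combined central values of Table 10; the half-difference is the additional systematic term that is added linearly to the uncertainty on the combination.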



Table 10: Predictions for the number of QCD multijet background events using the factorization
method with two different parametrizations and their combination, for the baseline and search
selections, with their statistical and systematic uncertainties.

Method               Baseline                   High-$\slashed{H}_T$       High-$H_T$
                     selection                  selection                  selection
Gaussian model       19.0 ± 1.6 [syst. ...]     0.3 ± 0.1 [syst. ...]      13.0 ± 1.3 [syst. ...]
Exponential model    31.4 ± 2.4 [syst. ...]     0.5 ± 0.1                  21.6 ± 2.0 [syst. ...]
Combined             25.2 ± 2.4 [syst. ...]     0.4 ± 0.1 [syst. ...]      17.3 ± 2.0 [syst. ...]



7 Results and interpretation 
7.1 Results and limits 



The number of events observed in data and the event yields predicted by the different
background estimation methods are summarized in Table 11 for the three different selections. The
total background is calculated summing the QCD R&S, the Z($\nu\bar{\nu}$)+jets from photons, and
the W/$t\bar{t}$ lost-lepton and hadronic-$\tau$ estimates. No excess of events is observed in either the
high-$\slashed{H}_T$ or high-$H_T$ search regions.

Table 11: Predicted number of background events from the different estimates for the baseline
and search selections, their total, and the corresponding number of events observed in data.
The background combination is performed as explained in the text. The uncertainties shown
include both statistical and systematic uncertainties. The last line gives the 95% confidence
level (CL) upper limit on the number of possible signal events.

Background process                               Baseline        High-$\slashed{H}_T$   High-$H_T$
                                                 selection       selection              selection
Z($\nu\bar{\nu}$)+jets ($\gamma$+jets method)    26.3 ± 4.8      7.1 ± 2.2              8.4 ± 2.3
W/$t\bar{t}$ $\to e, \mu$+X                      33.0 ± 8.1      4.8 ± 1.9              10.9 ± 3.4
W/$t\bar{t}$ $\to \tau_h$+X                      22.3 ± 4.6      6.7 ± 2.1              8.5 ± 2.5
QCD multijet (R&S method)                        29.7 ± 15.2     0.16 ± 0.10            16.0 ± 7.9
Total background                                 111.3 ± 18.5    18.8 ± 3.5             43.8 ± 9.2
Observed in data                                 111             15                     40
95% CL upper limit on signal                     40.4            9.6                    19.6


In order to derive limits on new physics, the expected number of signal events for each event
selection is estimated using simulated signal samples, taking into account uncertainties on
the event selection, theoretical uncertainties related to the event generation, and an overall
luminosity uncertainty. Many of these uncertainties have a dependence on the event kinematics,
and hence are model dependent.

The largest experimental contribution to the uncertainties arises from the model-dependent jet
energy scale and resolution uncertainties. These amount to 8% for the LM1 benchmark point.
Smaller uncertainties are due to the lepton veto and the trigger. For the former a 2%
uncertainty is determined for LM1; for the latter a conservative uncertainty of 1% is assigned. The
inefficiency of the rejection of events with energy in masked ECAL cells is determined to be
about 1.5% for the LM1 benchmark point. This full inefficiency is taken as the uncertainty,
even though the ECAL masked-channel simulation reproduces well the effect in data [29]. For
other event cleaning procedures, possible inefficiencies are determined in a low-$\slashed{H}_T$ data control
region to be negligible. Also the possible effect from the presence of additional pileup
interactions corresponding to the LHC 2010 data-taking conditions was investigated and found to be
insignificant. On the theoretical side, all uncertainties considered are model-dependent. The
largest one is associated with the factorization and renormalization scale uncertainties on the
next-to-leading-order cross-section corrections, yielding a 16% uncertainty for the LM1 point.
Smaller contributions come from uncertainties on the parton distribution functions and
initial-state radiation, respectively 3% and 2% for LM1. Final-state radiation uncertainties are found
to be negligible. Finally, a luminosity uncertainty of 4% is accounted for [50], along with the
statistical uncertainty on the simulated signal samples, which is about 2% for the LM1 sample.
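
To give a feeling for the overall size of the LM1 signal uncertainty, the individual contributions quoted above can be added in quadrature. This total is not quoted in the text; treating the sources as independent and Gaussian is an assumption of this sketch, and the dictionary keys are illustrative labels.

```python
import math

# Relative uncertainties (in percent) quoted above for the LM1 benchmark point.
lm1_uncertainties_percent = {
    "jet energy scale and resolution": 8.0,
    "lepton veto": 2.0,
    "trigger": 1.0,
    "masked ECAL cell rejection": 1.5,
    "factorization/renormalization scale": 16.0,
    "parton distribution functions": 3.0,
    "initial-state radiation": 2.0,
    "luminosity": 4.0,
    "signal sample statistics": 2.0,
}

# Quadrature sum, assuming uncorrelated sources.
total_percent = math.sqrt(sum(v ** 2 for v in lm1_uncertainties_percent.values()))
print(f"Quadrature sum: {total_percent:.1f}%")
```

Under this assumption the total is dominated by the theoretical scale uncertainty, with the jet energy scale and resolution as the leading experimental term.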

The probability distributions corresponding to each uncertainty source, whether Gaussian,
bifurcated Gaussian, Poisson, or box shaped, are convolved using a numerical integration MC
technique to obtain the probability distributions for each background and for the overall
background estimation. The presence of several sources of uncertainties makes the overall
combination quite Gaussian in shape, as expected from the central limit theorem. The resulting
distribution is fitted to a Gaussian function, and the mean and standard deviation are used as
the central value and uncertainty in the limit calculations described in the following sections.
This last step is applied in order to obtain the best symmetric approximation to a distribution
with a residual asymmetry.

7.2 Interpretation within the CMSSM 

The parameters $m_0$ and $m_{1/2}$ of the CMSSM are varied in 10 GeV steps for three different values
of $\tan\beta$ = 3, 10, and 50. Leading-order ISAJET [51] signal cross sections are used and corrected
by next-to-leading-order K factors calculated using PROSPINO [37]. The total signal efficiency
including geometrical acceptance and selection efficiency varies over the CMSSM phase space,
being in the range 20-30% for the high-$H_T$ selection and 10-20% for the high-$\slashed{H}_T$ selection,
as shown in Fig. 4.
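
The role of the signal efficiency can be made concrete with the usual counting relation, yield = cross section × K factor × integrated luminosity × efficiency. The sketch below uses the 36 pb$^{-1}$ data sample and the high-$H_T$ event limit from Table 11; the efficiency of 25% is an assumed illustrative value within the quoted range, and the function names are hypothetical.

```python
def expected_signal_events(cross_section_pb, k_factor, luminosity_pb_inv, efficiency):
    """Expected signal yield for a leading-order cross section and K factor."""
    return cross_section_pb * k_factor * luminosity_pb_inv * efficiency

def cross_section_limit_pb(event_limit, luminosity_pb_inv, efficiency):
    """Translate an upper limit on signal events into a cross-section limit."""
    return event_limit / (luminosity_pb_inv * efficiency)

LUMINOSITY_PB_INV = 36.0     # integrated luminosity of the data sample
ASSUMED_EFFICIENCY = 0.25    # illustrative value, not a measured number

# 95% CL event limit for the high-HT selection, converted to a cross section.
sigma_limit = cross_section_limit_pb(19.6, LUMINOSITY_PB_INV, ASSUMED_EFFICIENCY)
print(f"Cross-section upper limit: {sigma_limit:.2f} pb")
```

A parameter point is excluded in this simplified picture when its predicted cross section exceeds the limit obtained for its own efficiency, which is why the efficiency is evaluated separately for each point.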




Figure 4: Total signal efficiency for the $\slashed{H}_T$ (left) and $H_T$ (right) selections, as a function of $m_0$
and $m_{1/2}$. The other CMSSM parameters are $\tan\beta = 10$, $\mu > 0$, and $A_0 = 0$.

The expected upper limits on the CMSSM cross section are calculated using the background
estimate from data under the no-signal hypothesis. For the determination of the observed upper
limit the signal contamination in the background estimate is corrected for. In the isolated-muon
control region of the lost-lepton and hadronic-$\tau$ methods, the signal contamination is calculated
and removed from the background estimate for each CMSSM parameter point. For both
selections, the signal contributions to the background estimate are 2-3 events for the lost leptons
and 1-2 events for the hadronic tau decays. The signal contamination in the $\gamma$+jets control
region is found to be negligible. The QCD multijet background estimation with the R&S method
is not affected by signal contamination.

The modified frequentist procedure CL$_s$ [52, 53] with a likelihood ratio test statistic is used for
the limit calculation. In Fig. 5 the observed and expected CL$_s$ 95% confidence level (CL) upper
limits are shown in the CMSSM $m_0$-$m_{1/2}$ (left) and the gluino-squark (right) mass planes
for $\tan\beta = 10$, $\mu > 0$, and $A_0 = 0$. The contours are the envelope with respect to the best
sensitivity of both the $H_T$ and the $\slashed{H}_T$ search selections. For $m_0 < 450$ GeV the $\slashed{H}_T$ selection is
more powerful, while for large $m_0$ the $H_T$ selection is more important. A previously published
search by CMS for supersymmetry in hadronic events [17] using the event shape observable
$\alpha_T$ [16] is shown for reference. The $\alpha_T$ analysis aims at the best possible removal of the QCD
multijet background, and is particularly powerful for small jet multiplicities and high missing
transverse energy. Because of the high signal selection efficiency in a large fraction of the
phase space, and in spite of the larger background compared to the $\alpha_T$ selection, the analysis
presented here is able to improve the limits previously set by the $\alpha_T$ analysis.
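
For a single-bin counting experiment without systematic uncertainties, the CL$_s$ quantity reduces to a ratio of Poisson probabilities, and the upper limit can be found by a simple search. The sketch below shows only that simplified case; it ignores the background and signal uncertainties and will therefore not reproduce the limits of Table 11. The function names are hypothetical.

```python
import math

def poisson_cdf(n_obs, mean):
    """P(n <= n_obs) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n_obs + 1):
        term *= mean / k
        total += term
    return total

def cls(n_obs, background, signal):
    """CLs = CL(s+b) / CL(b) for a counting experiment."""
    return poisson_cdf(n_obs, signal + background) / poisson_cdf(n_obs, background)

def upper_limit(n_obs, background, alpha=0.05, tolerance=1e-6):
    """Signal yield at which CLs falls to alpha, found by bisection."""
    low, high = 0.0, 10.0
    while cls(n_obs, background, high) > alpha:
        high *= 2.0
    while high - low > tolerance:
        mid = 0.5 * (low + high)
        if cls(n_obs, background, mid) > alpha:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)

# Example call with an observed count and a background expectation.
limit_no_syst = upper_limit(n_obs=15, background=18.8)
```

Dividing by the background-only probability protects against excluding signals to which the search has no sensitivity when the data fluctuate below the expected background.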




Figure 5: The expected and observed 95% CL upper limits in the CMSSM $m_0$-$m_{1/2}$ (left) and
gluino-squark (right) mass planes for LO and NLO cross sections. The ±1 standard deviation
($\sigma$) band corresponds to the expected limit. The contours are the combination of the $H_T$ and the
$\slashed{H}_T$ selections such that the contours are the envelope with respect to the best sensitivity. The
CMSSM parameters are $\tan\beta = 10$, $\mu > 0$, and $A_0 = 0$. The limit from the earlier CMS analysis
is shown as a blue line and limits from other experiments as the shaded regions. For the area
labeled "$\tilde{\tau}$ LSP" the stau becomes the LSP. The LM1 SUSY benchmark scenario is shown as a
point.

7.3 Interpretation with Simplified Model Spectra 

Models for new physics can also be studied in a more generic manner using a simplified model
spectra (SMS) approach [20-22]. Simplified models are designed to characterize experimental
data in terms of a small number of basic parameters. They exploit the fact that at the LHC the
final-state kinematics of events involving strongly produced massive new particles are largely
determined by the parton distribution functions and phase-space factors associated with two-
and three-body decays. Using these simplified models, the experimental results can then be
translated into any desired framework.

For the simplified models used in this paper, it is assumed that the new particles are strongly 
produced in pairs whose decay chains ultimately result in a stable weakly interacting massive 
particle, denoted as LSP. The particles produced in the hard interaction can be identified as 
partners of quarks and gluons. In SUSY these would be the squarks ($\tilde{q}$) and gluinos ($\tilde{g}$). Even
though the SMS are more generic, in the following everything is phrased for simplicity in terms
of super-partner names. Two benchmark simplified models are investigated for the number
of jets and $\not\!H_T$ signature in this analysis: pair-produced gluinos, where each gluino directly
decays to two light quarks and the LSP, and pair-produced squarks, where each squark decays
to one light quark and the LSP. In Fig. 6 the respective diagrams for these simplified models are
drawn. To limit the set of SMS studied, only a few are chosen that can bracket the kinematic
properties of the different final states. For this reason the gluino-squark associated production
is neglected.




Figure 6: Diagrams of the studied simplified models. Left: gluino pair production; right: 
squark pair production. 

The simplified models are simulated with the PYTHIA generator [31], the CTEQ6L1 parton
distribution functions [54], and the parametrized CMS detector simulation. For each topology, 
samples are generated for a range of masses of the particles involved, and thus more mass 
splittings are explored than in the CMSSM, where the ratio of the gluino and the LSP masses is 
approximately fixed. 

In the following, the measured cross section upper limits are compared to a typical reference 
next-to-leading-order cross section from PROSPINO [37]. In the case of squark pair production
this reference cross section corresponds to the squark-antisquark cross section with four light 
flavours included, with the gluinos becoming nearly decoupled at 3 TeV. This cross section is 
used to convert upper limits on the production cross section to reference limits on new-particle 
masses. 
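
As an illustration of this conversion, the sketch below finds the mass at which a falling reference cross section crosses the observed upper limit, interpolating linearly in the logarithm of their ratio. The function name `excluded_mass`, the `scale` parameter, and the toy numbers are assumptions made for the example; they are not taken from the analysis, which may use a different interpolation.

```python
import math


def excluded_mass(masses, sigma_ref, sigma_up, scale=1.0):
    """Return the mass below which scale * sigma_ref exceeds the upper limit.

    masses    : increasing list of new-particle masses (GeV)
    sigma_ref : reference cross section at each mass (pb)
    sigma_up  : observed cross section upper limit at each mass (pb)
    scale     : multiplier on the reference cross section (hypothetical knob)
    Returns None if no crossing is found inside the scanned range.
    """
    for i in range(len(masses) - 1):
        # A positive log-ratio means the model point is excluded.
        r0 = math.log(scale * sigma_ref[i] / sigma_up[i])
        r1 = math.log(scale * sigma_ref[i + 1] / sigma_up[i + 1])
        if r0 > 0.0 >= r1:
            # Linear interpolation of the log-ratio to its zero crossing.
            frac = r0 / (r0 - r1)
            return masses[i] + frac * (masses[i + 1] - masses[i])
    return None


# Toy inputs for illustration only, not values from this analysis.
toy_masses = [400.0, 600.0]
toy_sigma_ref = [4.0, 0.25]
toy_sigma_up = [1.0, 1.0]
mass_limit = excluded_mass(toy_masses, toy_sigma_ref, toy_sigma_up)
```

With these toy inputs the log-ratio changes sign halfway between the two scan points, so the interpolated mass limit is 500 GeV. Setting `scale` above one moves the crossing to higher masses, as expected for a larger assumed production rate.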

In Fig. 7 the total signal efficiency of the high-$\not\!H_T$ selection, including geometrical acceptance
and selection efficiency, is shown within the simplified model space for gluino and squark pair 
production, as a function of the gluino (left) or squark mass (right) and the LSP mass. Only 
the lower half of the plane is filled because the model is only valid when the gluino or squark 
masses are larger than the mass of the LSP. The signal selection efficiency increases for higher 
gluino and squark masses, and is low on the diagonal, where the mass splitting is small and 
jets are produced with lower transverse momentum. 

The limit calculation in the SMS space is performed using a Bayesian framework with a flat
prior for the signal [46]. The same sources of uncertainties affecting the signal geometrical
acceptance and selection efficiency are incorporated for each scan point as for the CMSSM
interpretation, namely the jet energy scale and resolution, the lepton veto, the cleaning including
the veto on large energy loss in masked ECAL cells, the trigger, the initial- and final-state
radiation, the parton distribution functions, the luminosity, and the statistical uncertainty. The
renormalization and factorization scale uncertainties do not apply here because they only
influence the normalization of the reference cross section. The presence of signal events in the
background sample is not considered, since the studied SMS processes do not produce prompt
leptons or photons, and since the R&S method is insensitive to such contamination.
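
A minimal sketch of a Bayesian upper limit with a flat signal prior is given below for a single counting experiment. It assumes a precisely known background and neglects all systematic uncertainties, which in the analysis enter through the sources listed above; the function names and the numerical integration scheme are illustrative choices, not the implementation used for the results in this paper.

```python
import math


def posterior_unnormalized(s, n_obs, b):
    """Poisson likelihood for n_obs events with mean s + b, times a flat prior in s >= 0."""
    mu = s + b
    # Evaluate in log space to stay numerically stable for large counts.
    return math.exp(n_obs * math.log(mu) - mu - math.lgamma(n_obs + 1))


def bayesian_upper_limit(n_obs, b, cl=0.95, steps=20000):
    """Return s_up such that the posterior probability of s < s_up equals cl."""
    # Generous integration range; the posterior is negligible beyond it.
    s_max = n_obs + 10.0 * math.sqrt(n_obs + 1.0) + 20.0
    ds = s_max / steps
    # Midpoint rule on a uniform grid in the signal yield s.
    weights = [posterior_unnormalized((i + 0.5) * ds, n_obs, b) * ds
               for i in range(steps)]
    total = sum(weights)
    running = 0.0
    for i, w in enumerate(weights):
        running += w
        if running >= cl * total:
            return (i + 1) * ds
    return s_max


def cross_section_limit(s_up, efficiency, luminosity):
    """Convert a limit on the signal yield into a cross section limit.

    Uses sigma_up = s_up / (efficiency * luminosity); with the luminosity
    in pb^-1 the result is in pb.
    """
    return s_up / (efficiency * luminosity)
```

For zero observed events the posterior reduces to an exponential in the signal yield, so the 95% limit is $-\ln(0.05) \approx 3.0$ events independent of the background. This provides a simple check of the numerical integration.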

Figure 7: Total high-$\not\!H_T$ selection efficiency for gluino (left) and squark (right) production as a
function of the gluino (left) or squark (right) mass and the LSP mass.

In Fig. 8 the exclusion 95% CL upper limits on the production cross sections are presented for
the high-$\not\!H_T$ search selection. This selection is found to be more sensitive than the high-$H_T$
search selection for both considered simplified model spectra. Using this model-independent
representation with the simplified model spectra, these upper limits on the cross section can be
translated into a limit on any complete model such as SUSY.




Figure 8: 95% CL upper limits on the gluino (left) and squark (right) pair-production cross 
sections for the high-$\not\!H_T$ selection, as a function of the gluino (left) or squark (right) mass and
the LSP mass. The contours where the reference cross section and three times this cross section 
can be excluded are shown. 



8 Conclusions 

An inclusive search for new physics has been presented using events with a multijet signature 
with large missing transverse momentum. The observed event yield is consistent with the SM 
background contributions, arising mainly from $Z(\nu\bar{\nu})$+jets, $W(\ell\nu)$+jets, $t\bar{t}$ including a W that
decays leptonically, and QCD multijet production. These SM contributions were estimated
directly from the data using several novel techniques, giving a minimal reliance on simulation.
The overall uncertainty on the resulting total background prediction is dominated by the
statistical uncertainty.

In the absence of an excess of events above the expectation, upper limits are derived in the 
CMSSM parameter space. In the R-parity conserving CMSSM with $A_0 = 0$, $\mu > 0$, and $\tan\beta = 10$,
a 95% CL upper limit on the production cross section in the range between 2 and 3 pb is obtained,
depending on the squark and gluino masses considered. Gluino masses below 500 GeV
are excluded at 95% CL for squark masses below 1000 GeV. Similar results are obtained for
other $\tan\beta$ values. The results are also more generically interpreted in the context of simplified
models where final states are described by the pair production of new particles which decay 
either to one or two jets and a dark-matter candidate escaping detection. We obtain a 95% CL 
upper limit on the production cross section for such new particles in the range between 0.5 and 
30 pb, depending on the masses of the new particles in the decay chains. 